Foreword



Jeffrey A. Owings, Associate Commissioner 
Addressing the theme, “How to measure school performance in a tangible way,” scholars in the field of education
finance presented their thinking at the 1998 National Center for Education Statistics (NCES) Summer Conference. 
The implicit questions posed by all of the presentations revolve around the current and future financial status of 
school districts, how to portray that condition, and the significance of that standing for school performance. 

Developments in School Finance contains papers presented at the annual NCES Summer Data Conference. This 
Conference attracts several state department of education policymakers, fiscal analysts, and fiscal data providers 
from each state, who are offered fiscal training sessions and updates on developments in the field of education 
finance. The presenters are experts in their respective fields, each of whom has a unique perspective or interesting 
quantitative or qualitative research regarding emerging issues in education finance. The reaction of those who 
attended the Conference was overwhelmingly positive. We hope that will be your response as well. 

This proceedings is the fifth education finance publication from the NCES Summer Data Conference. The papers 
included within present the views of the authors, and are intended to promote the exchange of ideas among research- 
ers and policymakers. No official support by the U.S. Department of Education or NCES is intended or should be 
inferred. Nevertheless, NCES would be pleased if the papers provoke discussions, replications, replies, and refuta- 
tions in future Summer Conferences. 
Introduction and Overview



William J. Fowler, Jr. 

National Center for Education Statistics 



The learned education finance researchers who presented 
at the 1998 National Center for Education Statistics 
(NCES) Summer Conference examined both current 
theoretical perspectives on public school finance and the 
everyday policy concerns of schools. Thus, the first four 
articles in this volume reflect policy studies of how to 
measure school performance in a tangible manner. The 
last three articles serve as theoretical explorations of cut- 
ting-edge research in the field of school finance. The 
implicit questions posed by all of these articles revolve 
around the current and future financial status for school 
districts, how to portray that condition, and the signifi- 
cance of that standing for school performance. 

The first paper asserts how difficult it is to measure a 
school’s productivity, particularly if one is concerned with 
more than test scores, such as constructive employment 
for graduates. Measuring school productivity is made 
more difficult because America’s schools seek to achieve
a wide array of inherently difficult-to-measure student 
outcomes, such as responsible citizenship. The next pa- 
per makes no effort to measure productivity in the eco- 
nomic tradition. Instead, it attempts to usefully com- 
pare a school district with similar peer school districts, in 
order to benchmark spending and performance. A third 
paper explores potential threats to the measurement of 
public school productivity, such as removing the most 
talented students through implementation of school 
voucher programs. Another scholar focuses attention on 
a natural experiment in Chicago’s public schools, in which 
responsibility for the distribution of funding was shifted
from the district level to the school level. This paper
then examines school funding and its relationship to stu- 
dent achievement outcomes. 

Two papers on the cutting edge of school finance research 
seek to better assess fiscal equalization. One employs 
hierarchical linear modeling (HLM), one of the most so- 
phisticated statistical analytical tools available to educa- 
tional researchers. HLM permits analysts to sort out dif- 
ferences that occur within and between organizational 
units, such as schools and school districts. The other 
paper explores methods of presenting financial data in 
graphic displays that permit widely diverse audiences, 
such as state legislators and parents, to quickly grasp 
meaning that can be used to improve school performance. 

A final cutting-edge paper describes how the NCES Early 
Childhood Longitudinal Survey for Kindergarten (ECLS- 
K) has incorporated a student-level finance measure, 
which potentially has the ability to most accurately as- 
sess questions of the relationship of resources to student 
outcomes. In addition, observers in the sample of Kin-
dergarten schools assess the adequacy of the school fa-
cilities and record their subjective assessment of the learning
climate of the school. To education finance researchers,
these data represent the pinnacle of the information pyra- 
mid, where information on students, parents, teachers, 
and schools can all be combined in a myriad of combi- 
nations to answer the multitude of questions in the field 
of education. 
To summarize, the papers from this year’s presentations
sought to explore how to assess the productivity of school- 
ing, whether through sophisticated econometric or sta- 
tistical procedures, or simple comparison. Let us now 
turn to the specifics of each paper. 

The publication begins with an overview of the difficulty 
in measuring a school’s productivity. Richard Rothstein,
of the Economic Policy Institute, asserts that the 
politicization of “educational productivity” has caused a 
mismatch in the way educational goals (inputs) are de- 
fined and the method by which results (outputs) are as- 
sessed. Although some products of an education system 
have concrete methods of assessment, such as test scores, 
it is a challenge to define many educational goals because 
the outputs of schools are, to some extent, inherently not 
measurable. Also, building a relationship between rec- 
ognizable outputs and latent inputs is a complex process. 

As a basis for this statement, Rothstein outlines the Na-
tional Education Goals set forth by the National Educa-
tion Goals Panel, which are a mix of concrete and ab-
stract goals for the Nation’s education system. Unfortu-
nately, not all of these goals can be measured using tradi-
tional methods (e.g., proficiency test scores). More ab-
stract goals, such as being prepared for responsible citi-
zenship, are difficult to measure. It is important to rec-
ognize the effect these different types of goals have on
each other, as they are not independent entities.

Rothstein argues that school finance scholars need to al- 
ter their focus from attaining more accurate calculations 
of school productivity to broadening how school pro- 
ductivity is gauged, and the type of data required. His 
examination of how “then and now” studies of educa- 
tion are conducted helps the reader place the satisfaction 
with the nation’s public schools in historical perspective.
These studies were initiated as a response to the often-
heard public cry of “today’s schools do not measure up to
the standards of the past.” Rothstein postulates that al- 
though these studies may have been methodologically 
unsound, chronicling these types of analyses provides 
meaning to the unchanging debate about the quality of 
American education. He concludes that, due to rapidly 
changing demographics, the increasing ineffectiveness of 
public school productivity studies over time emphasizes 
the need for the measurement of a broad range of school 
outputs and a more effective matching of school inputs 
and latent outputs. 



Elizabeth Greenberg and John Guarnera, of American 
Institutes for Research, seek to make pragmatic 
benchmarking comparisons between comparable school 
districts with extant data. They investigate education fi- 
nancing and outcomes in Philadelphia and make com- 
parisons among other big city school districts, other Penn- 
sylvania school districts, and between cities and states. 

Greenberg and Guarnera focus many of their compari-
sons on school-level characteristics, such as current ex-
penditures, student/teacher ratios, student achievement,
percentage of budget spent on instruction, dropout rates, 
and teacher absences. Furthermore, they examine the ra- 
tio of city school district administrative expenditures to 
instructional expenditures and include comparisons on 
demographic characteristics, such as population loss and 
poverty level. 

Each comparison made provides the reader with greater 
insight into the complexity of the relationship of schools 
between or within cities and states. Comparing specific 
education indicators within a state is usually an easy task. 
However, while comparison among states is often desir- 
able, it is difficult to make these comparisons because of 
the inconsistencies among the methods states use to cal- 
culate these indicators. Although states may measure 
spending differently, accurate comparisons of spending 
patterns can be calculated. For example, a city may be 
compared to its state’s average, and then comparable city
school districts compared after this ratio is calculated. 
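
The ratio approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the district names and dollar figures are hypothetical placeholders, not data from Greenberg and Guarnera, and the helper name relative_spending is invented here.

```python
# Hypothetical per-pupil spending figures (illustrative only, not real data).
# Each city is paired with the average for its own state, so that differences
# in how states measure spending cancel out within each ratio.
spending = {
    "City A": {"city": 6000.0, "state_avg": 5000.0},
    "City B": {"city": 9000.0, "state_avg": 10000.0},
}


def relative_spending(city, state_avg):
    """Return city spending as a ratio of its own state's average."""
    if state_avg <= 0:
        raise ValueError("state average must be positive")
    return city / state_avg


# Step 1: compare each city to its state's average.
ratios = {
    name: relative_spending(v["city"], v["state_avg"])
    for name, v in spending.items()
}

# Step 2: compare cities to one another using the ratios,
# not the raw dollar amounts.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {ratio:.2f} times its state average")
```

In this made-up case City B spends more in raw dollars, yet City A spends more relative to its own state, which is the kind of reversal the ratio is meant to expose.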

Greenberg and Guarnera demonstrate that such compari- 
sons are difficult to perform, and require the application 
of much judgement. Although NCES has an Internet 
tool to compare school districts (see http://nces.ed.gov/ 
edfin/search/search_intro.htm), Greenberg and Guarnera 
found they had to go far beyond this primitive web tool. 

Dan D. Goldhaber, of the Urban Institute, Dominic J. 
Brewer, of RAND, and Eric R. Eide, of Brigham Young 
University, examine the issue of “creaming,” in which choice
programs encourage academically talented public school students
to move to private schools. As a setting for their research,
Goldhaber, Brewer, and Eide examined the Milwaukee 
Parental Choice Program (MPCP). Opponents of choice 
programs often argue that such programs will exacerbate 
the flight of academically talented students from the pub- 
lic schools. 
Prior research, which analyzed mathematics and reading
test scores, has resulted in widely differing conclusions 
about the outcomes of participating in the Milwaukee 
choice program. Goldhaber, Brewer, and Eide believe 
that there are important differences in student character- 
istics between public and private schools. Some of these 
differences are easily observable and, therefore, easily 
measurable; however, there are unobservable differences 
that can cause biased estimates of the effects of private 
schooling. For example, parents with academically tal- 
ented children might choose private sector schooling for 
their children more often than choosing to send their 
children to the Milwaukee public schools. Those par- 
ents who choose private sector schooling may be differ- 
ent in some subtle way from parents who choose public 
sector schooling. These differences are difficult to define 
and measure empirically, particularly if parental attributes 
are combined in datasets in some aggregate manner. 

The design of the Milwaukee choice program may help 
researchers evaluate the existence of selection bias. To 
qualify for this program, students must meet poverty 
guidelines, not be enrolled in a private school in the im- 
mediate prior year, and be eligible to attend private, non- 
sectarian schools in the school district. As a result of the 
qualification criteria, the number of students admitted 
to this program is limited. If schools receive more appli- 
cants than their enrollment limit, they are required to 
accept students based on random selection. This ran- 
dom assignment of students permits researchers to com- 
pare students who applied and were not accepted with 
those who applied and were accepted with less concern 
about sample selection bias. Because all public school 
students who met the criteria for the program had the 
choice to participate, the Milwaukee program is espe- 
cially useful for research into the issue of selection bias. 
However, Goldhaber, Brewer, and Eide conclude that the 
data yield little indication that sample selection or
creaming is associated with the decision to participate in 
the Milwaukee choice program. 

Many in the school finance community recommend shift- 
ing responsibility for the distribution of funding from 
the district level to the school level. This decentralized 
system of allocation is being adopted in many schools 
throughout the country; however, there has not been 
much research into the methods by which school-level 
allocations are determined. Ross Rubenstein, of Geor- 
gia State University, examines the impact of such a shift 
on the distribution of funding to Chicago schools. 



Rubenstein’s article delves into this issue of allocation and
how differences between traditional district-level and 
newer school-level allocations affect performance equity 
among schools. 

In the late 1980s, Chicago public schools were decen- 
tralized in order to shift the responsibilities for gover- 
nance and improvement to the individual schools. This 
decentralized school system provides Rubenstein with a 
solid location for analyses of differences in expenditure 
patterns and student performance across schools. 

When comparing general fund spending patterns with 
school performance, Rubenstein finds that the spending 
patterns among high and low performing schools are simi- 
lar. Although small differences exist, a pattern does not 
emerge. This result is not surprising, for general fund 
allocations must be used to provide the same basic ser- 
vices to all students. A comparison of Chapter I spend- 
ing reveals a more interesting pattern. Although both 
elementary and high schools spend the greatest propor- 
tion of their Chapter I funds on instruction, high schools 
distribute remaining funds evenly across all functional 
areas, whereas elementary schools tend to focus on in- 
structional support and administration. Overall, when 
considering all types of funding and both elementary and 
secondary schools, Rubenstein finds a consistent pattern: 
Although there is not much variation in total spending 
among schools, there is variation in decisions concern- 
ing discretionary funds. 

Patrick Galvin, Hal Robins, and Karen Callahan, of the 
University of Utah, examine the issue of school finance 
equalization by using hierarchical linear modeling 
(HLM). In their study, HLM is applied to control for 
differences in school district characteristics. Under the 
assumption that schools are hierarchically nested, two 
schools could appear exactly alike yet perform differently 
due to other contextual variables. The HLM method 
assists researchers by controlling for these contextual vari- 
ables. 

This article is based on the assumption that the underly- 
ing goal of school finance equalization is to promote edu- 
cational achievement. Although there is concern about 
the fair distribution of resources, Galvin, Robins, and 
Callahan assert that the goal of equalization efforts is to 
equalize educational opportunities and outcomes. They 
propose that there is a mismatch in policy goals between 
distribution of educational resources (equity) and con-
cern about the use of these resources (productivity).

The authors find many advantages when using HLM to 
analyze school finance equalization. This method of 
analysis develops a relationship between distribution of 
inputs and an outcome measure. In addition, HLM per- 
mits the researcher to concurrently examine the effect of 
predictor variables on both the average intercept and slope 
of the relationship variable. The HLM model also per- 
mits exploration of interactions, and confounding effects 
of variables. For example, it is 

“...one thing to operate as a high-need school 
within a relatively wealthy environment and 
quite another to operate as a high-need school 
in a relatively poor environment.” 
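
The intercept-and-slope language above can be made concrete with a generic two-level model. The notation is the common textbook form and is an assumption for illustration; it is not necessarily the specification that Galvin, Robins, and Callahan estimate. Here $Y_{ij}$ is an outcome for school $i$ in district $j$, $X_{ij}$ is a school-level resource measure, and $W_j$ is a district-level characteristic (all hypothetical symbols).

```latex
% Level 1 (schools within district j):
Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij}

% Level 2 (districts): the intercept and the slope each depend on W_j
\beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j}
\beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j}
```

In this sketch, $\gamma_{01}$ captures how the district context shifts the average outcome, and $\gamma_{11}$ captures how it changes the resource-outcome relationship, which is the sense in which a high-need school in a wealthy environment may differ from one in a poor environment.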

Galvin, Robins, and Callahan seek to learn the ways in 
which schools, classrooms, and students function com- 
pared with their resource environment. By using the 
HLM model, the authors learn that the effects of resources 
can be hidden by interaction with other variables and 
that more extensive research into these interactions is 
needed in order to determine the relationship between 
educational resources and achievement. They find that 
one of the primary contributions of the use of HLM is 
that it changes the focus of the research from inequity in 
resources to the relationship between resources and out- 
comes. 

The school finance research community often forgets that 
its audience is composed of policymakers, parents, and
the general public. Because of the wide range of sophis- 
tication of these diverse audiences, research outcomes 
must be presented in concise, understandable ways. Larry 
Toenjes, of the University of Houston, in the next ar-
ticle of this publication, explores active graphics meth-
ods for the analysis and display of education data. Ad-
vances in and increased accessibility to personal comput-
ers have enabled education finance researchers to present
data analyses in new, previously unimaginable ways. Ad-
vances in this technology are occurring so quickly that
staying informed about the latest updates may overwhelm 
the user’s ability to take advantage of what these new tech- 
nologies offer. 

Three of the most popular active graphics display meth-
ods are described in this article. The first method is linking,
which allows a researcher to present two different
relationships among two pairs of attributes of the data — 
how a change in one attribute would affect both rela- 
tionships. The second method is brushing, which en- 
ables a researcher to control the position of an outline 
shape on the monitor screen so that the points descend- 
ing within this shape can be visibly modified. Third, 
Toenjes highlights the spinning method, in which a re- 
searcher has the ability to rotate a three-dimensional fig- 
ure around any or all of its axes. 

Toenjes explores some of the software programs that 
employ these and other techniques. The history presented 
for these examples gives the reader a better understand- 
ing of the development and use of these programs. Fol- 
lowing this background information, several sample ap- 
plications and sample active graphics are presented to 
help the reader understand the benefits of real-life appli- 
cations of these programs. For example, Toenjes displays 
the relationship in New York, Indiana, and New Jersey 
between the percent of local tax funds and the expendi- 
ture per pupil. The visual display clearly indicates that 
state aid has effectively equalized expenditures in New 
Jersey for poor school districts, but not in New York or 
Indiana. Using visual display, it is even possible to show 
specific school districts that are rich and poor, and con- 
trast their characteristics, including per pupil expendi- 
tures. Although Toenjes concludes that these visual meth- 
ods cannot replace the traditional techniques, they can 
facilitate understanding and communicating research re- 
sults to a broad audience with wide differences in levels 
of statistical and research sophistication. 

The final article in this publication brings the reader full 
circle. The publication began with a contemplation of the
limits on our ability to measure schools’ productivity; it is concluded
by Lawrence O. Picus and Lauri Peternick’s suggestions 
for improving the Early Childhood Longitudinal Survey 
(ECLS) to enable improved measurement of the elusive 
variables many of the articles in this publication have 
described. Picus, of the University of Southern Califor- 
nia, and Peternick, of American Institutes for Research, 
suggest the ECLS can be a valuable resource for collect- 
ing new finance data that will help uncover the relation- 
ship between student outcomes and resource allocation 
and use. By enhancing this existing survey with ques- 
tions about how money matters in education, research- 
ers would have the opportunity to delve into these latent 
relationships of resource allocation and their effects on
student achievement. 

Picus and Peternick outline four broad categories for 
analysis. First, they propose examining classroom costs, 
which include teacher compensation, instructional aide 
compensation, instructional materials, and special pro- 
grams. Second, they propose analyzing school-level costs, 
which include site administration, instructional support, 
student support, maintenance and operation, utilities, and 
transportation. Third, they propose analyzing district- 
level costs in terms of district administration, facilities, 
and data processing. Picus and Peternick also suggest 
analyzing nonschool costs, such as other agency expen- 
ditures and parental support. They conclude by giving 
us their recommendations for the items they think should 
be included in the ECLS. 



Picus and Peternick then turn to the actual ECLS survey 
instruments containing fiscal data that were devised by
NCES after field-testing and comment by respondents. 
These survey instruments are mere shadows of the exten- 
sive items Picus and Peternick had recommended, but 
they avoid the excessive burden that would have been 
imposed on respondents by the many items they pro- 
posed. 

As noted by Picus and Peternick, the collection and ulti- 
mate analysis of these data will provide a framework from 
which school districts can make wise resource allocation 
decisions, and will provide insight into understanding
how and why resources matter. 
Reflections on the Limitations of our
Ability to Measure Schools' Productivity, 
and Some Perspective from the Past 

Richard Rothstein 

Economic Policy Institute 



About the Author 



Richard Rothstein is a research associate of the Economic
Policy Institute, a senior correspondent of The American
Prospect, and an adjunct professor of public policy at Oc-
cidental College in Los Angeles. Some material in the
following article has been drawn from his book, The Way
We Were?, published in 1998 by the Century Foundation,
describing trends in measurement of student achieve-
ment over the last century. Rothstein’s recent work on
school expenditure trends was reported in Where’s the
Money Going?, a 1997 sequel to Where’s the Money Gone?
Changes in the Level and Composition of Education Spend-
ing, both published by the Economic Policy Institute.
Reflections on the Limitations of our
Ability to Measure Schools' Productivity, 
and Some Perspective from the Past 

Richard Rothstein

Economic Policy Institute 



Claims about declining educational productivity are 
flawed for at least two reasons, each unremediable with 
present knowledge and data. First, we are quick to mea- 
sure school outputs by math and reading scores, but re- 
quire schools to produce a much broader range of out- 
comes, most of which are unmeasured and some of which 
are unmeasurable. Second, we have such poor longitu- 
dinal data on math and reading performance that claims 
based on declines in this performance rely more on selec- 
tive anecdote than on representative data. Nonetheless, 
these unfounded claims have persisted for the better part 
of this century, with school observers in each era repeat- 
ing the errors of the last. 

The pages that follow attempt to illustrate each of these 
fatal limitations in our measurement of school produc- 
tivity: first, the lack of definition and data relevant to 
many school goals; second, the persistent historical reli- 
ance on anecdote, not data, to support assertions of de- 
clines in the poorly measured outcomes of math and read- 
ing, and the attempts of school defenders to refute these 
anecdotal claims. 

Even today, investigations of “educational productivity” 
are more primitive than many researchers and consum- 
ers of education research are inclined to acknowledge. 

The intense politicization of public education encour-
ages making claims which surpass those supportable by
unambiguous data. There are important public policy 
consequences to conclusions about whether schools have 
growing or declining productivity. Belief in the former 
encourages those seeking greater tax support for public 
schools. Belief in the latter encourages those seeking 
privatization or other radical structural reforms of school- 
ing. 

This heavy political burden borne by conclusions about 
school productivity is not a new phenomenon. With 
inadequate data, public education’s critics have made 
claims regarding schools’ “declining productivity” for 
nearly a century, and defenders have struggled to find 
evidence with which to refute them. It may be that the 
data available to us are too limited and imprecise to sup- 
port legitimate conclusions about trends in education 
productivity. If this is the case, the contributions of quan- 
tifiable education research can play only a limited role in 
these debates. More important roles must be reserved 
for qualitative investigation and for the clarification of 
public values. 

Trends in education productivity may be impossible to 
quantify with certainty because the outputs of schools 
are mostly unmeasured. True, we devote great scholarly
resources to measuring a few outputs such as standard- 
ized test scores in reading and mathematics, but there is 
much less interest in examining whether these outputs 
have a predictable relationship to schools’ broad range of 
other outputs. 

What are the outputs we seek from schools? The closest 
we have to an official list is the National Education Goals, 
first adopted by President Bush and the nation’s gover-
nors in their 1989 Charlottesville meeting. Each year,
the National Education Goals Panel produces a report 
on whether we are closer to or farther from meeting these
goals. The report essentially consists of “up” or “down” 
arrows to indicate whether or not progress is being made. 

The following are the goals, as elaborated by the National 
Education Goals Panel: 

■ All children should start school ready to learn; they 
should have been born with an appropriate 
birthweight, should have access to high quality pre- 
school programs, should have parents who have 
access to necessary training and support, and 
should receive the nutrition, physical activity, and 
health needed for mental alertness. 

■ Ninety percent of children should graduate from 
high school, and three-fourths of those who drop 
out will complete an equivalent degree; there will 
be no gap in high school graduation rates between 
minority and white students. 

■ All students should demonstrate competency over 
challenging subject matter (including reasoning, 
problem solving, and communication) in English, 
mathematics, science, foreign languages, civics and 
government, economics, arts, history and geogra- 
phy; they should be prepared for responsible citi- 
zenship (including participation in community 
service and activities demonstrating personal re- 
sponsibility), further learning, and productive em- 
ployment; achievement should improve for stu- 
dents in each quartile of the distribution, and the 
gap between performance of minority and white 
children should narrow; students should have ac- 
cess to physical and health education to ensure 
they are healthy and fit; more students should be 
competent in more than one language; and all stu-
dents should be knowledgeable about the diverse
heritage of this nation and the world community. 

■ U.S. students will be first in the world in math- 
ematics and science achievement; mathematics and 
science education will be strengthened, including 
use of the metric system of measurement; the num- 
ber of qualified mathematics and science teachers 
will increase; the number of U.S. undergraduate 
and graduate students, especially women and mi- 
norities, who complete degrees in mathematics, 
science and engineering, will increase. 

■ Every adult will be literate, with skills needed to 
compete in a global economy and exercise his/her 
citizenship; connections between education and 
work will be strengthened, and quality public li- 
brary programs for adults will grow in number. 
The proportion of students, especially minorities, 
entering and completing college will increase, and 
college graduates’ critical thinking skills will im- 
prove. Schools themselves will offer more adult 
literacy and parent training programs. 

■ Every school will be free of drugs, violence, fire- 
arms, and alcohol and will teach drug and alcohol 
prevention as an integral part of health education. 
Schools will also eliminate sexual harassment. 

■ All teachers will have access to preservice teacher 
education and continuing professional develop- 
ment to provide the skills needed to teach a cur- 
riculum that can achieve the other goals. 

■ Every school will engage parents in a partnership 
that supports children’s academic work and shared
educational decision-making at school. 

Even if we ignore items on this list that are not true out- 
puts of schools (for example, whether children are born 
with appropriate birthweights), there are enough real 
outputs to make it evident that mathematics and reading 
scores alone cannot be the numerator in a measure of 
schools’ productivity.

Could it be, however, that mathematics and reading scores 
are also proxies for other outputs, so we can presume 
that if mathematics and reading scores rise or decline, 
other outputs will also follow a similar trajectory? Not 
only is this not necessarily the case; the opposite may be 
true. Systems with complex combinations of goals must
always be wary of measuring only a few, because there
will be powerful pressures on the system to focus on en- 
hancing the production of the most easily measured out- 
puts, at the expense of others that are equally important, 
but more difficult to measure. Everyone has heard stories 
of how “accountability for results” distorted enterprises 
in the old Soviet economy. When shoe factories were 
told they would be held accountable for the number of 
shoes they produced, the factories responded by produc- 
ing only small sizes — gaining rewards for exceeding quo- 
tas without having to purchase more leather. 

As the interest of both policymakers and scholars in school 
productivity outpaces our ability to measure it, Ameri- 
can education faces similar dangers. Consider this dis- 
turbing set of facts: In the last decade, American second- 
ary schools have come under great public and political 
pressure to increase the number of academic courses high 
school students must take to qualify for a diploma. This 
pressure has worked. From 1988 to 
1994, the percentage of school dis- 
tricts requiring at least 4 years of sec- 
ondary school English for a high 
school diploma rose from 80 to 85 
percent; those requiring at least 3 years 
of mathematics rose from 35 to 45 
percent; and those requiring at least 3 
years of science rose from 17 to 25 
percent (U.S. Department of Educa- 
tion 1998, Indicator 26). Most 
Americans consider these data a sign 
of progress, and considered by them- 
selves, they certainly are. But our na- 
tional goals tell us that we also want 
students to have “access to physical 
and health education to ensure they 
are healthy and fit.” Nearly simultaneously with the increase
in academic course-taking, the percentage of students
taking a daily high school physical education course
declined from 42 to 25 percent, and the number of overweight
adolescents soared (Sammann 1998, 2). We do not know 
why this is the case, but one possibility is that pressure to 
require more mathematics and science courses to improve 
mathematics and science scores has led American high 
schools to find the time for these courses by reducing 
requirements for physical education. This result may not 
be consistent with Americans’ goals for education, which 
include both mathematics and science proficiency at ever 
higher levels, and the development of habits that lead to 
lifelong good health. If we hold schools accountable only
for the former, and find it relatively easier to measure
these, we run the risk of de-emphasizing other less im- 
portant aspects of achievement far more than a balanced 
program would require, no matter how important math- 
ematics and science proficiency may be. 

In truth, we have devoted little energy to attempts to 
measure many of the important outputs of schools. Not 
only do we have no standardized reports of adolescents’ 
physical health, but we have no (or very limited) trend 
data on the national goals of “responsible citizenship,” 
avoidance of drug, alcohol, and tobacco abuse, compe- 
tency in fields like the arts, or “knowledge about the di- 
verse heritage of this nation and the world community,” 
or “participation in community service and activities dem- 
onstrating personal responsibility.” The National As- 
sessment of Educational Progress (NAEP) has begun to 
test a few of these other areas (the new arts and music 
assessment is one example), but few states have high stakes 
tests that go beyond reading and math- 
ematics, so the potential for distortion 
of school production in favor of those 
outputs most easily measured is real. 



Conclusions about school productivity 
require not only measurement of out- 
puts but measurement of inputs. As I 
have argued elsewhere, because we have 
limited ability to match inputs with the 
particular outputs they are designed to 
produce, school productivity becomes 
an even more elusive concept. We might
make some tentative conclusions about
school productivity if we tracked mathematics
and reading scores and compared
them with changes in the school resources
devoted to enhancing mathematics and reading achievement,
but data on resources, reported by function and
object, not by program, are ill-suited for this purpose 
(Rothstein and Miles 1995). 

The aim of this paper is to urge school finance scholars 
to ease off, a little, in the quest for more precise calcula- 
tions of school productivity. Before making existing data 
more precise, more energy should be invested in broad- 
ening the data we require. 

In what follows, I attempt to place the desire for conclusions
about school productivity in historical perspective.
We are not the first generation of education researchers
to confront the need for data to respond to a broad pub- 
lic concern that school productivity is declining. School 
administrators and researchers frequently attempted to 
analyze the persistent claims of declining student perfor- 
mance by conducting “then and now” studies of educa- 
tion. 

In reading through the educational research literature of 
the past, one cannot help but be struck by the consis- 
tently defensive tone of many “then and now” studies: 
professors or school district administrators began their 
published reports by recounting the denunciations of 
schools by politicians, journalists, academics, pundits, or 
parent groups who had claimed that “today’s” schools do
not measure up to the standards of the past, that teachers 
no longer instructed children in basic skills, and that 
young people knew less than they once did. The profes- 
sors or administrators then exclaimed that they had been 
subjected to this abuse long enough, 
and therefore had combed school dis- 
trict archives for tests given to students 
several decades earlier. The researchers 
then described how they had adminis- 
tered these outdated tests to contem- 
porary students, under conditions as 
similar to those of the past as possible. 

The reports frequently concluded by 
showing the contemporary scores to be 
superior, refuting the conventional wis- 
dom of the day. 

While most, though not all, of these 
“then and now” studies were method- 
ologically unsophisticated, and while 
careful social scientists today would 
never sanction such “uncontrolled” (for background char- 
acteristics) research, the history of these studies sheds a 
useful light on the unchanging debates about the quality 
of American education. Also, if we make the not-unrea- 
sonable assumption that demographic change may not 
have been as rapid during the period of these earlier studies 
as it is today, the studies, on the whole, suggested that 
school critics of the past were mistaken. 

There have been several dozen such investigations; I will 
here describe only a few typical examples. The baseline 
was established by America’s very first standardized test,
administered in 1845 to a select group of 500 of Boston’s
brightest 8th-graders. Results were disappointing. The
testing committee reported it “difficult to believe that
there should be so many children ... unable to answer,
... so many absurd answers, so many errors in spelling, in
grammar, and in punctuation.” Only 45 percent of these
top 14-year-olds knew that water expands when it freezes.
When children did answer a question correctly, they fre- 
quently did not understand the answer they had given, 
because, as the examining committee put it, the children 
had been taught “the name of the thing rather than the 
nature of the thing.” According to Massachusetts Secretary
of Public Instruction Horace Mann, Boston’s schools
were ignoring higher-order thinking skills; what little students
knew came from memorizing “words of the textbook,
... without having ... to think about the meaning
of what they have learned.” Thus 35 percent knew from
history classes that, prior to the War of 1812, the United 
States had imposed an embargo on British and French 
shipping, but few had any idea what “embargo” meant. 
In one school, 75 percent of the students knew the date 
of the embargo, but only 5 percent 
could define the term (Caldwell and
Courtis 1924, 52, 54, 90, 125).

In 1924, Otis Caldwell, a school direc- 
tor and Columbia Teachers College 
professor in New York City, and Stuart 
Courtis, director of teacher training at 
Detroit Teachers College, noted that 
“[s]urvey after survey has revealed unsuspected
inadequacy or inefficiency in
American education,” resulting in
“[s]uperintendents and teachers [being]
dismissed” and “school systems and 
methods [being] reorganized.” 
Caldwell and Courtis determined to 
“bring a long-delayed message of en- 
couragement to all who have participated in accomplish- 
ing the educational progress of the last fifty years.” To 
do so, they uncovered the test given by Horace Mann’s 
committee of examiners in 1845, and re-administered 
this 1845 Boston test to a national sample (“from Maine 
to California”) of 8th-graders in 1919 (Caldwell and 
Courtis 1924, v, vi, 8, 9, 77). 

Mann’s test questions that had retained curricular rel- 
evance 75 years later were selected — questions like those 
asking students to describe the “height of a heavenly 
body,” or, “how high can you raise water in a common 
pump, with a single box?,” or about the invasion of 
Canada in the “last war” were dropped. Caldwell and
Courtis printed a new exam with these remaining questions,
and invited school districts to participate. School
superintendents from 46 states volunteered. Unlike the 
Mann test, which had been given only to the best “brag 
scholars” (students whom Mann described as “the flower 
of the Boston schools”), the superintendents agreed to 
administer the test to all 8th-graders who were present 
on the day the test was given. Twelve thousand exams 
were returned for scoring. 

Caldwell and Courtis found that despite the fact that the 
test in 1919 had been administered to a full range of 8th-graders,
not only the brightest as in 1845, the median
score on the still-relevant questions had been 37.5 per- 
cent correct in 1845, but 45.5 percent in 1919. They 
concluded that children in 1919 did somewhat worse 
than the earlier children in “pure memory” questions, 
and somewhat better on the “thought or meaningful ques- 
tions.” With respect to my earlier example, the research- 
ers reported that “in 1845, 35 percent 
of the children knew the year when the 
embargo was laid by President Jefferson, 
but only 28 percent knew what an em- 
bargo was. In 1919, only 23 percent 
knew the year, but 34 percent knew the 
meaning” (Caldwell and Courtis 1924,
85-87).

In 1934, a Los Angeles school researcher,
Elizabeth Woods, gave a 1924
6th-grade reading test to students in 33
elementary schools in which the test
had been administered ten years earlier.
She found that scores were half a grade
higher in 1934 than they had been in 
1924 (Raths and Rothman 1952; Gray 
and Iverson 1954). 

In 1946, Don Rogers, a Chicago assistant school super- 
intendent, tired of hearing “employers... allege that 
present-day pupils (even high school graduates) are not 
proficient... The imputation is that ... our school system
formerly trained them better than now” (Rogers 1946). 
So Rogers re-administered a 6th-grade Chicago arithmetic 
test from 1923. He found that the 1946 pupils on aver- 
age scored about the same as 1923 pupils (despite the 
unusually high teacher turnover the 1946 students had 
experienced during World War II, and the constant dis- 
ruptions of wastepaper, soap, and other wartime drives 
conducted in schools) and concluded that this test “discounts
the allegations that ... Chicago pupils of an earlier
generation did better work than their sons and daughters 
who are now in the elementary schools.” 

In 1948, Springfield (Missouri) schools came under at- 
tack from a citizens group for embracing tenets of “pro- 
gressive education” and for ignoring the teaching of ba- 
sic skills, particularly in reading. University of Illinois 
Professors F. H. Finch and V. W. Gillenwater undertook
a study to “reveal whether the teaching of reading had 
increased or decreased in effectiveness,” by re-administering
a 1931 6th-grade reading test to contemporary 6th-graders
in the same Springfield schools. They found that
1948 students had higher scores, and concluded that 
“Apparently reading instruction ...is now more effec- 
tive... and most sixth grade children now in schools do 
better in reading than did their predecessors” (Finch and 
Gillenwater 1949). While Finch and Gillenwater did 
not use formal statistical controls that we would expect 
in such research today, they superficially 
investigated the characteristics of 1931
and 1948 students, and determined 
that the occupational classifications of 
the parents were similar in the two 
years. 

Tests of General Educational Develop- 
ment (GED) are used as an alternative 
high school certification for students 
who drop out. They were originally 
developed in 1943 by the Army to as- 
sess the academic skills of draftees. To 
establish a scoring scale, the Army con- 
tracted for the test to be administered 
to a representative sample of 35,000 
seniors in 814 high schools across the 
country (representative, that is, except that in segregated 
states only white schools were included). In 1955, at a 
time of ferocious public criticism of the public schools 
(and when the belief that schools had deteriorated was as 
widespread as it is today), Army officials wondered if the 
1943 scale was still appropriate. So the Department of 
Defense contracted with the University of Chicago to 
conduct a new study, giving a 1955 GED test to a simi- 
larly representative national group of seniors. Then, a 
smaller sample of students were given both the 1943 and 
1955 tests, so that the scales on the two could be equated. 
The Chicago professor who analyzed the results, Benjamin
Bloom, concluded that “[I]n each of the GED tests
the performance of the 1955 sample of Seniors is higher
than the performance of the 1943 sample... [I]n mathematics
the average senior tested in 1955 exceeds 58 percent
of the students tested in 1943.” (Average performance
also exceeded earlier scores in natural sciences, literary
materials, English, and social studies.) “These differences
are not attributable to chance variation in test
results,” Bloom concluded, and “indicate that the high 
schools are doing a significantly better job of education 
in 1955 than they were doing in 1943” (Bloom 1956). 

In the early 1950s, Vera Miller and Wendell Lanton, re- 
searchers working for the Evanston (Illinois) school dis- 
trict, noted that parents and educators often charged that 
“too much time [is] being devoted to music, arts, crafts, 
dramatics and unit work [group projects] to the detri- 
ment of the ‘Three R’s.’” In response, Miller and Lanton 
reprinted the standardized reading tests that had been 
given 20 years earlier in the district, and re-administered 
them to contemporary students. Like Finch and 
Gillenwater in Missouri several years 
earlier, they also had no formal statisti- 
cal controls for background character- 
istics, but they noted that the “commu- 
nity was relatively stable [and] present 
day groups of pupils and those of the 
past were similar in most respects... The 
area contains a cross section of people 
of different races and of varied social 
and economic status.” To assure the 
most practically consistent test condi- 
tions, the district’s testing director from 
the 1930s also administered the test in 
the 1950s, using similar procedures.
The 1950s tests were given on or near
the same day of the month as the 1930s
tests. Miller and Lanton tested 3rd-,
4th- and 8th-graders from 1952 to 1954, and found that,
for example, 4th-graders in 1952 scored 6 months higher
in reading comprehension, and 8 months higher in vocabulary,
than did their 1932 counterparts. “[P]resent
day pupils read with more comprehension and under- 
stand the meaning of words better than did children who 
were enrolled in the same grades and schools more than 
two decades ago,” Miller and Lanton concluded (Miller 
and Lanton 1956). 

In 1976, the Indiana state Superintendent of Public Instruction,
Harold Negley, teamed with two Indiana University
professors, Roger Farr and Leo Fay, to examine
the state’s reading instruction. Their report notes that in
1976, “the charge is sometimes made that children do
not read as well as in the past and that schools are to
blame” (Farr and Negley 1978). In 1945, the state had
administered a standardized reading test to a sample of 
25 percent of the state’s students at each grade level. In 
1976, Farr, Fay, and Negley reprinted the 1945 tests for 
the 6th- and 10th-grades, and re-administered the tests
to a comparable sample of students. The new sample, 
which included 7 percent of the state’s students in those 
grades, was selected to be representative of the state’s re- 
gional diversity and urban-rural-suburban distribution. 
Thus the demographics of the test-takers in the two time
periods were as similar as possible. Their raw results revealed
that 6th- and 10th-graders in 1976 read at virtually
the same grade level as comparable students in 1945.
For example, the average 1945 6th-grader read at exactly 
the national 6th-grade norm that had been established in 
1943, while the average 1976 6th-grader read at one-tenth 
of one month below the 1943 6th-grade norm. 

The state of Indiana, however, had kept 
unusually good records on the students 
who took the 1945 test, and Farr, Fay, 
and Negley noted that these students 
were considerably older than the 6th- 
and 10th-grade students who took the
test in 1976. In 1945, it was more com- 
mon not to promote students whose 
achievement was below grade level than 
it was in 1976: in the latter year, for 
example, the 6th-grade included 11- 
and 12-year-olds almost exclusively, but 
in 1945 there had been many 13- and 
14-year-olds in the 6th-grade as well.
In the 1940 census, average Indiana
6th-graders were 12 years and 4 months
old, but in the 1970 census, they were only 11 years and
6 months old, nearly a full year’s difference in average 
age. Consequently, the older 1945 students had been in 
school more years than the “comparable” 1976 students. 
Further, because fewer students dropped out between 9th- 
and 10th-grade in the later than in the earlier year, the
1945 10th-grade students were, on average, higher achievers,
relative to all young people their age, than were the
less selective group of 1976 10th-grade students. When
Farr, Fay, and Negley adjusted their results to compare
“age equivalent” scores rather than “grade equivalent”
scores, they found that the 1976 sample, for both 6th-
and 10th-grade, “outscored the 1945 sample significantly
on every test.”

“[T]he general national assumption that the reading abilities
of our children are decreasing at an alarming rate [is]
unsupported by this study,” the Indiana researchers con- 
cluded. This “ungrounded alarm,” however, “leads to 
attacks on school programs that have been developed over 
the same time span for which this study shows the im- 
provement in student reading achievement.” 
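
The age adjustment described above rests on simple arithmetic. As a minimal sketch, the census figures for the average age of Indiana 6th-graders can be compared directly. The helper name `to_months` is hypothetical, and this does not reproduce the researchers' actual age-equivalent scoring procedure, which the text does not describe.

```python
def to_months(years: int, months: int) -> int:
    """Convert an age given as years and months into total months."""
    return years * 12 + months


# Average ages of Indiana 6th-graders as reported in the text.
age_1940_census = to_months(12, 4)  # 12 years, 4 months
age_1970_census = to_months(11, 6)  # 11 years, 6 months

# Positive value: the earlier cohort was older on average.
age_gap_months = age_1940_census - age_1970_census
print(age_gap_months)  # 10
```

The 10-month gap is what the text describes as nearly a full year's difference in average age.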

Over the years, a few “then and now” studies have shown 
declining student achievement: a St. Louis Board of Edu- 
cation study, for example, found that reading achieve- 
ment was slightly less in 1938 than it had been in 1916 
(Boss 1940). However, most of these reports claimed 
improvement, to refute widely publicized attacks on 
schools in each era. 

School officials and researchers no longer publish such 
reports, perhaps because we recognize that in order to 
make reasonable “then and now” comparisons of test 
scores, we require more sophisticated controls than the 
informal demographic similarities noted in the earlier 
studies. Especially because demographic change in many 
districts and schools has been more rapid in recent years, 
we can no longer take “then and now” studies seriously 
without better data on test takers’ parental education,
occupation, and even income, as well as the children’s
race and ethnicity, family status, and other socio-economic 
characteristics. These data simply do not exist for past 
test scores, and there is no way to create them. 

Nor was it the case that these old “then and now” studies 
were conducted when it was possible to state without 
equivocation that school productivity could be measured 
solely in the fields of mathematics and reading. Many of 
these studies were conducted during the height of progressive
education’s influence, when the “Americanization”
of students and the delivery of a broad range of
social services through the schools were considered cen- 
tral to their mission. 

Nonetheless, then, as now, there was a public demand 
for higher school standards in reading and mathematics 
(referred to as the “Three R’s”), and the education policy
community conducted its debates as though reading and 
mathematics were the only goals for which schools could 
be held accountable. Then, as now, the debates were ul- 
timately unsatisfactory. They will continue to be unsat- 
isfactory, until we can measure the broad range of school 
outputs, and match disaggregated inputs to the outputs 
they are designed to achieve.

Introduction

In the spring of 1998, the state of Pennsylvania passed 
legislation giving it the power to take over the city of 
Philadelphia’s public schools because of poor student per- 
formance. This paper was written in response to that 
legislation. It attempts to assess the performance of 
Philadelphia’s public schools using easily available public 
data, in order to determine if the concerns of the state of 
Pennsylvania were justified. 

The first part of the paper compares Philadelphia’s pub- 
lic schools with schools in other large northeastern cities. 
The cities in the comparison data set were chosen be- 
cause they share with Philadelphia two demographic char- 
acteristics that are indicative of cities in trouble: high 
poverty rates and recent population loss. Philadelphia is 
compared with these cities in terms of current expendi- 
tures per pupil, student/teacher ratio, and average SAT 
scores. 

None of the cities in this comparison group is in the 
same state as Philadelphia. Since the United States does 
not have a uniform group of assessments taken by stu- 
dents in different states, it is difficult to compare the 
achievement of Philadelphia’s students with the achieve- 
ment levels of students in these other cities. SATs are 
taken by high school juniors and seniors across the country.
However, not all students participate in the SATs;
thus comparisons based on these tests may be misleading.

Therefore, an additional set of school districts within 
Pennsylvania, whose students were assessed using the same 
tests as the students in Philadelphia, was chosen as a sec- 
ond comparison group for Philadelphia. This group in- 
cluded the second and third largest cities in Pennsylva- 
nia, Pittsburgh and Harrisburg, since large cities often 
face similar problems in their educational systems. It 
also included two of the largest Philadelphia suburban 
school districts, because some educational issues such as 
the supply of qualified teachers vary by geographic area. 
Finally, it included the suburban Philadelphia public 
school district, Chester-Upland, with poverty rates clos- 
est to Philadelphia, as many studies have shown that pov- 
erty levels are one of the demographic characteristics most 
correlated with student achievement. 

The analysis in this paper does not pretend to be a de- 
finitive assessment of the quality of education in Phila- 
delphia. Such an assessment would require more time 
than was available to write this paper, and would also 
require access to data that is not in the public domain. 
The analysis in this paper is only intended to determine
if Philadelphia appears to be so far outside the realm of expected
performance that emergency measures are necessary.

However, this paper does illustrate a methodology that 
can be applied quickly to any school district to deter- 
mine if its performance is so bad when compared with 
other districts that outside intervention may be neces- 
sary. Any analysis such as this should be followed up by 
more careful study including an examination of curricu- 
lum, facilities, staff qualifications, and other factors. 

Comparisons between Philadelphia and 
Other Big City School Districts 

Population Loss 

In the fall of 1994, 208,000 students were enrolled in 
the Philadelphia School District, placing it among the 
top 10 school districts in the United States in terms of 
size. Although Philadelphia is still large, the city has been 
losing population. Between 1980 and 1992, the population
of Philadelphia declined by 8 percent. As illustrated
in figure 1, this drop in population also occurred in other
large northern and eastern cities. Because cities that are
losing population face a declining tax base and similar
fiscal constraints, we chose this group of population loss
cities as the comparison set for Philadelphia in this study.

Poverty Level 

In addition to the fact that they are all losing population, 
the set of cities is also characterized by the high percent- 
age of children living below the poverty line. Thirty per- 
cent of children in Philadelphia live in homes with in- 
comes below the poverty line, compared with a low of 25 
percent and a high of 46 percent among other cities in 
the comparison group (figure 2). 

Current Expenditures 

Philadelphia’s current expenditures per pupil are in the 
middle of the group of comparison school districts (fig- 
ure 3). In 1992-93, among school districts in the com- 
parison group, Newark, Washington, DC, Boston, and 
Milwaukee spent more per pupil than Philadelphia. 



Figure 1. — Percentage change in population, 1980-92
SOURCE: City and County Data Book, 1994.

Figure 2. — Percentage of children below poverty level
SOURCE: U.S. Department of Education, National Center for Education Statistics, Common Core of Data, 1992-93.

Figure 3. — Current expenditures per pupil
SOURCE: U.S. Department of Education, National Center for Education Statistics, Common Core of Data, 1992-93.

Student/Teacher Ratio

However, despite Philadelphia’s relatively high current
expenditures per pupil, among the comparison cities only
Memphis had a higher student/teacher ratio (figure 4).
In 1992-93, the most recent year for which we have been
able to obtain data, the Philadelphia Public Schools had
an average of 18.5 students per teacher. As illustrated in
figure 3, Cleveland and Milwaukee had almost identical
expenditures per pupil during that year, but Cleveland
had a ratio of only 14.4 students per teacher and Milwaukee
had a ratio of 16.6 students per teacher (figure 4).

Student Achievement 

The Philadelphia Public School District and the other 
school districts in the comparison group use different 
standardized tests to assess their students; therefore it is 
difficult to compare academic outcomes among the 
school districts. However, students living in most of these 
school districts who intend to go to college take the SAT 
examination. While SAT scores are not the best com- 
parison across school districts because the population 
taking them is self-selected and may vary significantly 
across districts, we can safely say that Philadelphia’s
average composite SAT scores are in line with those of
other districts in the comparison group (figure 5). 1 

Comparisons between Philadelphia and 
Other Pennsylvania School Districts 

Choosing within State Schools 

Because much of the data collected by governments in 
the United States is collected at the state level, it is easier 
to compare the Philadelphia Public School District with 
other school districts in the state of Pennsylvania than it 
is to compare it with large city school districts in other 
states. As our comparison group of school districts within 
Pennsylvania, we picked two other city school districts, 
Pittsburgh and Harrisburg, as well as three suburban 
Philadelphia school districts. Pittsburgh and Harrisburg 
were selected because they are the second and third larg- 
est cities in Pennsylvania, after Philadelphia. Two of the 
suburban districts, Cheltenham and Abington, are pri- 
marily middle class, although they include pockets of low- 
income families. They were chosen because they were 
among the largest districts geographically close to Phila- 
delphia. One of the suburban districts, Chester-Upland, 
is a poor district and was chosen because its poverty rates 
were similar to Philadelphia’s. 



Figure 4. — Student/teacher ratio 




1 Data on SAT scores is from 1995, while data on expenditures, student/teacher ratio, and poverty levels is from 1992-93. We used data 
for the most recent years available. Students’ SAT scores are influenced by their entire educational and life experience, not just their 
experience in the current school year.

Poverty Level

Figure 6 illustrates the percentage of low-income students 
in each of the Pennsylvania school districts in our com- 
parison group. 

Current Expenditures 

As illustrated in figure 7, Philadelphia’s current expenditures
per pupil during 1995-96 were lower than those of any of
the other Pennsylvania school districts in the comparison
group. Philadelphia spent $6,550 per pupil during
1995-96. Of the other districts in the Pennsylvania comparison
group, only Chester-Upland spent less than
$7,000 per pupil. Pittsburgh, the second largest city in
Pennsylvania, spent $9,500 in 1995-96, about 45 percent
more than was spent by Philadelphia.
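
A gap between two spending figures reads differently depending on which district is taken as the base. The following is a minimal sketch using only the two per-pupil amounts reported above; the function names are illustrative and do not come from the source.

```python
def percent_more(higher: float, lower: float) -> float:
    """How much larger `higher` is, as a percentage of `lower`."""
    return (higher - lower) / lower * 100


def percent_less(lower: float, higher: float) -> float:
    """How much smaller `lower` is, as a percentage of `higher`."""
    return (higher - lower) / higher * 100


# Current expenditures per pupil, 1995-96, as reported in the text.
philadelphia = 6550
pittsburgh = 9500

gap = pittsburgh - philadelphia
print(gap)                                                # 2950
print(round(percent_more(pittsburgh, philadelphia), 1))   # 45.0
print(round(percent_less(philadelphia, pittsburgh), 1))   # 31.1
```

With Philadelphia as the base, Pittsburgh spent about 45 percent more; with Pittsburgh as the base, Philadelphia spent a little under one-third less. Both describe the same gap of roughly $3,000 per pupil.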

Percentage of Budget Spent on Instruction 

Although Philadelphia spent less per pupil in 1995-96
than the other school districts in the Pennsylvania comparison
group, the percentage of its budget that Philadelphia
spent on instructional activities was about average
for this group of school districts. In 1995-96, Philadelphia
allocated 65 percent of its expenditures to instructional
activities, compared with a high of 70 percent
for Harrisburg and a low of 62 percent for Pittsburgh
(figure 8).

Figure 5. — Average composite SAT scores, 1995
SOURCE: Dr. Joyce Ladner, “Financing Education in the District of Columbia from the Perspective of the Financial Authority,”
presentation before the American Education Finance Association Annual Conference, March 7, 1997.

Figure 6. — Percentage of students living in low-income homes in selected Pennsylvania districts, 1996-
SOURCE: Pennsylvania System of School Assessment School Profiles 1996-97.

Drop-out Rates 

Figure 9 illustrates drop-out rates for 7th- to 12th-graders
in the comparison group of Pennsylvania school districts.
Philadelphia and Harrisburg have the highest drop-
out rates among schools in the Pennsylvania comparison 
group, and Cheltenham and Abington have the lowest 
rates. 

Teacher Absences 

In addition to Philadelphia and Harrisburg having the 
highest drop-out rates of any school district in the Penn- 
sylvania comparison group, teachers in these two school 
districts are more likely than teachers in other schools in 
the Pennsylvania comparison group to be absent for per- 
sonal reasons. As figure 10 shows, on any given day 6.5 
percent of Philadelphia teachers and 6.8 percent of Har- 
risburg teachers were absent for personal reasons, com- 
pared with 5 percent of Pittsburgh teachers, 4.7 percent 
of Cheltenham teachers, and 4.9 percent of Abington 
teachers. (Chester-Upland did not report the percentage
of teachers absent for personal reasons.) Although there
is no reason to believe that these absences are not legitimate,
the high absence rate in Philadelphia may indicate
that Philadelphia teachers are somewhat less committed 
than teachers in Pittsburgh or the suburban systems to 
arranging their personal lives so that they are at school as 
much as possible. The high absence rate of Philadelphia 
teachers is certainly a topic worthy of further investiga- 
tion. 
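
One quick way to apply this kind of peer comparison is to rank a district and measure its distance from the average of the others. The sketch below uses the absence rates quoted above. `compare_to_peers` is a hypothetical helper, and the unweighted peer mean is an assumption made for illustration; the paper itself reports no such statistic.

```python
from statistics import mean

# Percentage of teachers absent for personal reasons on any given day,
# as quoted in the text (Chester-Upland did not report a figure).
absent_pct = {
    "Philadelphia": 6.5,
    "Harrisburg": 6.8,
    "Pittsburgh": 5.0,
    "Cheltenham": 4.7,
    "Abington": 4.9,
}


def compare_to_peers(rates: dict, target: str):
    """Return (peer mean, gap from peer mean, rank) for one district.

    Rank 1 means the highest rate in the group. The peer mean is an
    unweighted average of the other districts (an illustrative choice).
    """
    peers = [value for name, value in rates.items() if name != target]
    peer_mean = mean(peers)
    gap = rates[target] - peer_mean
    rank = 1 + sum(1 for value in rates.values() if value > rates[target])
    return peer_mean, gap, rank


peer_mean, gap, rank = compare_to_peers(absent_pct, "Philadelphia")
print(round(peer_mean, 2), round(gap, 2), rank)
```

A gap computed this way only flags a district for closer study; it says nothing about why the rates differ.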

Student Achievement

Not surprisingly, students in the two middle class subur- 
ban districts in the sample, Cheltenham and Abington, 
scored better, on average, than students in Philadelphia 
on statewide assessments in mathematics, reading, and 
writing given in 1997. 2 Although Philadelphia Public 
School students had average test scores in 1997 on state- 
wide assessments that were lower than the statewide av- 
erage, their average scores are comparable to those of stu- 
dents in Harrisburg and higher than those of students in 
the Chester-Upland School District (figure 11). However,
students in Pittsburgh scored, on average, almost 100 points
higher on the 1997 statewide assessment than students in
Philadelphia and Harrisburg, despite the fact that Pittsburgh’s
percentage of low-income children was comparable to Harrisburg’s
and only slightly lower than Philadelphia’s that year (figure 6).
As discussed above, Pittsburgh’s drop-out rates were also lower
than Philadelphia’s or Harrisburg’s in 1996-97. Although it is



Figure 7. — Current expenditures per pupil for selected Pennsylvania school districts, 1995-96 




SOURCE: Pennsylvania System of School Assessment School Profiles 1996-97. 
2 We were not able to obtain standard errors for the Pennsylvania state tests, so we cannot determine if differences are statistically significant.
Figure 8. — Percentage of expenditures going toward instructional activities for Pennsylvania school
districts, 1995-96
[Districts shown: Philadelphia, Pittsburgh, Harrisburg, Abington, Cheltenham, Chester-Upland]



SOURCE: Pennsylvania System of School Assessment School Profiles, 1996-97. 



Figure 9. — Drop-out rates for selected Pennsylvania school districts, 1996-97 
[Districts shown: Philadelphia, Pittsburgh, Harrisburg, Abington, Cheltenham, Chester-Upland]
SOURCE: Pennsylvania System of School Assessment School Profiles, 1997-98. 



extremely difficult, if not impossible, to draw causal re- 
lationships between spending per pupil and educational 
outcomes, and the data we have presented here certainly 
do not allow us to do so, the fact that Pittsburgh spent 
$3,000 more per pupil than Philadelphia and $1,300
more per pupil than Harrisburg in 1996-97 indicates
that the relationship between spending and educational 
outcomes should be further investigated in this case (fig- 
ure 7). 



Comparisons between Philadelphia and 
the State of Pennsylvania, and Other Big 
City School Districts and States 

Comparisons between Cities and States 

Although each state measures such indicators as per pu- 
pil spending and drop-out rates somewhat differently, 
these measurements are usually consistent for all school 
districts within a state. Thus, if one school district in a 
Figure 10. — Percentage of teachers absent due to personal reasons for selected Pennsylvania school
districts, 1995-96
[Districts shown: Philadelphia, Pittsburgh, Harrisburg, Abington, Cheltenham]



SOURCE: Pennsylvania System of School Assessment School Profiles, 1996-97. 



Figure 11. — Grade 11 mathematics and reading and grade 9 writing assessment scores for selected
Pennsylvania school districts, 1995-96
[Series shown: mathematics, reading, writing]

NOTE: No writing assessment data is available for Chester-Upland School District. 
SOURCE: Pennsylvania System of School Assessment School Profiles, 1996-97. 
state spends twice as much as another, we can be reasonably
confident that each school district is measuring the
same set of expenditures, and the higher spending dis- 
trict does spend twice as much as the lower spending 
district. If the school districts are in different states, we 
cannot assume that they are measuring the same set of 
expenditures. It is possible that the school district spend- 
ing twice as much is including expenditures in its total, 
such as capital expenditures, that the lower spending 
school district in a different state does not include. 



By reporting a ratio of city spending to state spending, 
we can compare cities in different states, despite the fact 
that states measure spending differently. For example, if 
one city spends more than its state average, and another 
city spends less than its state average, we can infer that 
the first city is probably making a bigger effort to meet 
the needs of its students than the second city. 
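The ratio comparison described above can be sketched in a few lines of Python. This is only an illustration: the dollar figures are hypothetical and are not drawn from the data in this paper, and the function name `spending_ratio` is our own. The sketch also assumes that a state's broader definition of spending (for example, one that counts capital outlays) inflates the city figure and the state figure by roughly the same proportion, which is the condition under which the ratio cancels the definitional difference.

```python
def spending_ratio(city_per_pupil: float, state_per_pupil: float) -> float:
    """Return city per-pupil spending as a multiple of the state average."""
    if state_per_pupil <= 0:
        raise ValueError("state average must be positive")
    return city_per_pupil / state_per_pupil


# Hypothetical figures for illustration only.
# State X reports current expenditures only.
city_x, state_x = 6_600.0, 6_000.0

# State Y also counts capital outlays; we ASSUME this inflates every
# reported figure in State Y by the same 20 percent.
capital_markup = 1.2
city_y, state_y = 7_200.0 * capital_markup, 8_000.0 * capital_markup

ratio_x = spending_ratio(city_x, state_x)
ratio_y = spending_ratio(city_y, state_y)

# Raw dollars suggest City Y spends more, but relative to its own state
# average City Y spends less than City X does.
print(f"City X: ${city_x:,.0f} per pupil, ratio to state = {ratio_x:.2f}")
print(f"City Y: ${city_y:,.0f} per pupil, ratio to state = {ratio_y:.2f}")
```

If the definitional difference is not roughly proportional across districts within a state, the ratio will only partly correct for it, so the comparison should be read as approximate.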

Similarly, states measure drop-out rates differently, but
cities within a given state usually measure drop-out rates
consistently. Some states report drop-out rates for 7th- to
12th-graders, some states report drop-out rates for 9th- to
12th-graders, and other states report drop-out rates for 1st-
to 12th-graders. The drop-out rate for 1st- to 12th-graders is
always lower in any given school district than the drop-out
rate for 9th- to 12th-graders, because so few students drop
out of school before 9th grade.
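A short worked example may make the effect of the grade span concrete. The enrollment and drop-out counts below are hypothetical, chosen only to show how the same district produces different rates under the two definitions; the function name `dropout_rate` is ours.

```python
def dropout_rate(dropouts: int, enrollment: int) -> float:
    """Return the drop-out rate as a percentage of enrollment."""
    return 100.0 * dropouts / enrollment


# Hypothetical district: few students leave before 9th grade.
enrollment = {"grades_1_8": 8_000, "grades_9_12": 4_000}
dropouts = {"grades_1_8": 20, "grades_9_12": 400}

# Definition used by states that report 9th- to 12th-graders only.
rate_9_12 = dropout_rate(dropouts["grades_9_12"], enrollment["grades_9_12"])

# Definition used by states that report 1st- to 12th-graders: the larger
# denominator adds many students but very few additional drop-outs.
rate_1_12 = dropout_rate(sum(dropouts.values()), sum(enrollment.values()))

print(f"Grades 9-12 rate: {rate_9_12:.1f} percent")
print(f"Grades 1-12 rate: {rate_1_12:.1f} percent")
```

The same district therefore looks far better under the wider grade span, which is why raw rates from states with different definitions should not be compared directly.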

By reporting a ratio of city drop-out rates to state drop- 
out rates, we can compare cities in different states. If one 
city has a drop-out rate twice its state average, and an- 
other city has a drop-out rate equal to its state average, 
the first city probably has a bigger drop-out problem, 
even if the two states measure drop-out rates somewhat 
differently. 

Spending Per Pupil 

Philadelphia is one of only two big city school districts 
in our comparison group that does not spend more per 
pupil than the statewide average (figure 12). Generally, 
expenses are higher for big city school districts than for 



the state as a whole, because cities usually have higher 
labor costs than suburban or rural school districts. In 
addition, the pupils in city schools usually have higher 
needs than the pupils in rural or suburban schools. As 
illustrated in figure 2, most city schools have high per- 
centages of children living below the poverty line whose 
educational needs are often great. Therefore, it is trou- 
bling that Philadelphia is spending only the state average 
per pupil. 

Ratio of City School District Administrative 
Expenditures to Instructional Expenditures 

As illustrated in figure 13, Philadelphia’s ratio of admin- 
istrative to instructional expenses is low among the com- 
parison cities for which we were able to obtain data. 
Philadelphia’s spending on administrative expenses is 
equal to approximately 11 percent of its spending on instructional
expenses. Among our comparison cities, only
Baltimore has a lower ratio of administrative to instruc- 
tional expenditures. 



Figure 12. — Ratio of average per pupil spending in city school districts to average per pupil spending in
states 




NOTE: Ratios greater than 1.0 indicate average per pupil spending by a city school district is greater than average per pupil spending 
at the state level. Ratios equal to 1.0 indicate that average per pupil spending by a city equals average per pupil spending at the state
level. Philadelphia: Figures may not be exact due to incomplete reporting. Baltimore: Cost per pupil reflects the average cost of 
providing educational and related services to the students of the local school system. Philadelphia, Baltimore, and Chicago: Data 
reflect the 1996-97 school year. Camden: Data reflect the 1997-98 school year. Cleveland: Data reflect FY97 figures. Memphis: Data 
reflect 1996-97 Operating Expenditures per Student. Newark: Data reflect the 1997-98 school year. 

SOURCE: Philadelphia: Data taken from 1996-97 Pennsylvania Department of Education database. Baltimore: Data taken from 1997- 
98 Maryland State Department of Education Fact Book. Camden: Data taken from the New Jersey Department of Education 
Comparative Spending Guide. Chicago: Data taken from the 1997 School Report Card issued by the Illinois Board of Education. 
Cleveland: Data taken from the Cleveland City School District Profile distributed by the Ohio Department of Education. Memphis: 
Data taken from the Tennessee Department of Education 1997 Report Card. Newark: Data taken from the New Jersey Department of
Education Comparative Spending Guide. 
Teacher/Pupil Ratio

In addition to spending as much or more per pupil as the
state average, the city school districts in our comparison
group have teacher/pupil ratios that are as high as, or slightly
higher than, the state average (figure 14). Philadelphia’s
teacher/pupil ratio is approximately 10 percent higher
than the state of Pennsylvania’s teacher/pupil ratio. Both
Cleveland and Memphis have higher teacher/pupil ratios
in relation to their state averages than Philadelphia. Cleveland
has a teacher/pupil ratio that is 20 percent higher
than the Ohio average, and Memphis has a teacher/pupil
ratio that is 30 percent higher than the Tennessee average
(figure 14).

Drop-out Rates 

Approximately three students in Philadelphia drop out 
of school before finishing 12th grade for each student in 



the state of Pennsylvania who drops out of school before 
finishing 12th grade (figure 15). Among the schools in 
our comparison group for which we were able to obtain 
drop-out rate data, only Cleveland has a higher ratio of 
city drop-out rate to state drop-out rate. 

Conclusion 

None of the factors used to compare Philadelphia with 
other large, high poverty cities indicates that the Philadelphia
Public School District is doing a significantly
worse job than the school districts in these other cities.
Philadelphia’s student/teacher ratio is at the high end for 
this group, but it is not the highest. Philadelphia’s cur- 
rent expenditures per pupil are in the middle of the group 
of larger city comparison districts. The fact that 
Philadelphia’s average SAT scores are in line with those 
in other large cities in the comparison group indicates 



Figure 13. — Ratio of city school district administrative expenditures to instructional expenditures
[Districts shown: Philadelphia, Baltimore, Camden, Cleveland, Newark]

NOTE: Data for Chicago Public Schools and Memphis City School District are not available. Philadelphia: Data from the 1996-97 
school year. Baltimore: Administration includes system wide regulation, direction, and control of the Local Education Agency (LEA), 
including office of the superintendent, business services, centralized support services, and instructional direction and improvement 
services. Instruction includes activities that address teaching regular students or enhancing the educational experience for students. 
Included in this category are classroom instruction, excluding special education services; school media services; cocurricular 
activities; office of the principal; guidance services; and psychological services. Camden: Data reflect the 1997-98 school year. 
Cleveland: Data reflect FY97 figures. Newark: Data reflect the 1997-98 school year.

SOURCE: Philadelphia: Data taken from Pennsylvania Department of Education for the 1996-97 school year. Baltimore: Data taken 
from Maryland State Department of Education for the 1996-97 school year. Camden: Data taken from New Jersey Department of
Education Comparative Spending Guide. Cleveland: Data taken from the Cleveland City School District Profile distributed by the Ohio 
Department of Education. Newark: Data taken from New Jersey Department of Education Comparative Spending Guide. 
Figure 14. — Ratio of city school district teacher/pupil ratio to state teacher/pupil ratio
[Districts shown: Philadelphia, Baltimore, Chicago, Cleveland, Memphis]



NOTE: Data for Camden City School District and Newark City School District are not available. Philadelphia: Data include teachers at 
all levels and in all areas. Cleveland: Data reflect FY97 figures. 

SOURCE: Philadelphia: Data taken from 1996-97 figures regarding total number of teachers and total enrollment. This includes 
teachers at all levels and in all areas. Baltimore: Data taken from 1996-97 full-time-equivalent enrollment figures provided by the 
Maryland Department of Education. Chicago: Data taken from the 1997 School Report Card issued by the Illinois Board of Education. 
Cleveland: Data taken from the Cleveland City School District Profile distributed by the Ohio Department of Education. Memphis: 
Data taken from the Tennessee Department of Education 1995-96 School Year Annual Statistical Report. 



Figure 15. — Ratio of city school district drop-out rates to state drop-out rates
[Districts shown: Philadelphia, Baltimore, Chicago, Cleveland, Memphis]



NOTE: Data for Newark City Schools and Camden City Schools are not available. Philadelphia: Figures reflect the annual rate. 
Baltimore: Drop-out rate reflects the rate for students in grades 9-12 for the 1996-97 school year. Cleveland: Data reflect FY97
figures. Memphis: Data reflect 1996-97 dropout rates for students in public schools, grades 9-12.

SOURCE: Philadelphia: Data taken from the Pennsylvania Department of Education 1996-97 data base. Baltimore: Data taken from 
1997-98 Maryland State Department of Education Fact Book. Chicago: Data taken from the 1997 School Report Card issued by the
Illinois Board of Education. Cleveland: Data taken from the Cleveland City School District Profile distributed by the Ohio Department 
of Education. Memphis: Data taken from the Tennessee Department of Education 1997 Report Card. 
that the students in Philadelphia who are considering
applying to college are receiving an education that is com- 
parable with those of students in the other large cities as 
measured by the SAT examination. 

On average, large city schools do have students with 
greater needs than students in other schools in their state. 
Socioeconomic status has been repeatedly shown to be 
highly correlated with student achievement, and large city 
schools generally have students with lower average socio- 
economic status than suburban schools. Thus, the fact 
that Philadelphia is one of only two big city school sys- 
tems in the comparison group that spends less than the 
state average per pupil is somewhat troubling, as it indi- 
cates that all the needs of Philadelphia’s students may not 
be adequately met. However, the fact that Philadelphia 
is spending a lower percentage of its budget on adminis- 
tration than all but one of the other big city schools indi- 
cates that expenditures that directly impact students in 
the classroom may not be as deficient as they first appear. 

As expected, Philadelphia appears in a less favorable light 
when compared with wealthier suburban Pennsylvania 
school districts. Philadelphia’s drop-out rate is higher, 
and its test scores are lower, than the two neighboring 
districts in the comparison group. Philadelphia also has 
a higher teacher absentee rate than these districts, which 
can be an indicator of teachers’ lack of commitment to 
their job and students. 



However, when compared with a high poverty neighbor- 
ing suburban district, Philadelphia did not appear to per- 
form nearly so poorly. While Philadelphia’s drop-out rate 
was higher than in the neighboring high poverty subur- 
ban school, its average test scores on the Pennsylvania 
state tests were also higher. 

Finally, when compared with the other two large cities in
Pennsylvania, Pittsburgh and Harrisburg, Philadelphia’s
performance did not appear unusually poor. Students in
Philadelphia had average test scores comparable with those in
Harrisburg, although they were lower than the average
test scores in Pittsburgh. Philadelphia’s drop-out rates
and percentage of teachers absent for personal reasons
were also comparable with Harrisburg’s, although higher than
Pittsburgh’s.

While the data discussed above indicate that there is prob- 
ably substantial room for improvement in the Philadel- 
phia public schools, both in terms of inputs such as ex- 
penditures per pupil, and in terms of outcomes such as 
drop-out rates and test scores, the data do not indicate 
that Philadelphia’s performance is substantially divergent 
from what one would expect to see in a large, high poverty
school district. Drastic measures may indeed be required
to improve the performance of the Philadelphia
public schools, but if they are, schools in other cities also 
require similar intervention. 



School Choice in Milwaukee: 
Are Private Schools Creaming 
Off the Best Students? 



Dan D. Goldhaber

The Urban Institute 



Dominic J. Brewer 

RAND 



About the Authors 



Dan D. Goldhaber is a labor economist who serves as a 
Research Associate at the Urban Institute in the Educa- 
tion Policy Center and as a member of the Alexandria 
City School Board. His research focuses on educational 
productivity and reform at the K-12 level and on teacher
labor markets. Examples of work in these areas include 
analyses of the demographic and productivity impacts of 
educational vouchers, the effects of teacher qualifications 
and quality on student achievement, and the effect of the 
interaction between student and teacher race, gender, and 
ethnicity on student outcomes. He has published in numerous
academic economics and education journals, including
the Journal of Human Resources, the Journal of
Urban Economics, Economics of Education Review, Education
Economics, Industrial and Labor Relations Review, and
Phi Delta Kappan. Dr. Goldhaber has served as a reviewer
for numerous academic journals and presented his
research at professional meetings such as the American 
Economic Association, the American Educational Re- 
search Association, the Association for Public Policy and 
Management, and the American Educational Finance 
Association. Dr. Goldhaber received a Ph.D. and M.S. 





Eric R. Eide

Brigham Young University 



in Labor Economics from Cornell University and a B.A. 
in economics from the University of Vermont. 

Dominic J. Brewer is a labor economist at RAND spe- 
cializing in the economics of education and the Director 
of RAND Education. In recent years his research has 
focused on educational productivity and teacher incen- 
tives in both K-12 and higher education. His research 
has examined policy issues through the analysis of large 
national databases including the original Coleman Report 
data, High School and Beyond, and the National Educational
Longitudinal Study of 1988. Examples of this work
include an analysis of the effects of teacher education and 
quality on student achievement gains, the interaction 
between student and teacher race, gender and ethnicity, 
and the effects of administrative resources on student per- 
formance. He has completed a series of studies on the 
effects of ability grouping on student achievement using 
an American Education Finance Association/National 
Center for Education Statistics Young Scholars Award. 
Recent higher education research has included a study of 
community college faculty’s connections to the labor 






market for the National Center for Research in Voca- 
tional Education, and a study of the labor market payoff 
to attending different types of four-year college. He has 
published in numerous academic economics and education
journals, including Review of Economics and Statistics,
Educational Evaluation and Policy Analysis, Journal of
Labor Economics, Journal of Human Resources, and Journal
of Policy Analysis and Management. He is an associate
editor of Economics of Education. He serves on the Na- 
tional Center for Education Statistics Research Design 
Panel on school finance. Dr. Brewer received a Ph.D. in 
Labor Economics from Cornell in 1994, and holds a 
bachelor’s degree from Oxford. He has been at RAND 
since 1994 and has also been a Visiting Assistant Professor
of Economics at UCLA and on the faculty of the
RAND Graduate School. 

Eric R. Eide is an associate professor in the economics 
department at Brigham Young University. He is a labor 
economist whose research focuses on the economics of 
education and earnings inequality. His early research in 
higher education examined how college major affects an 
individual’s lifetime earnings, as well as how differences 



in college major choice contribute to earnings differen- 
tials among various gender and racial groups. Professor 
Eide has more recently analyzed higher education issues 
such as the labor market premium associated with at- 
tending colleges of differing selectivity, and the effect of 
college education on earnings inequality in the United 
States. Examples of his work on primary and secondary 
education include studies on the long run consequences 
of grade retention, the effect of school spending on the 
distribution of student achievement and labor market 
earnings, and the relationship between participating in 
extracurricular activities and educational attainment and 
labor market outcomes. His research has been published 
in economics journals such as the Journal of Human Resources,
Economics of Education Review, Journal of Population
Economics, Southern Economic Journal, Economics
Letters, and Contemporary Economic Policy. Professor Eide
has been on the faculty at Brigham Young University since 
1993, where he has taught courses on labor economics, 
econometrics, and statistics. He received a Ph.D. in Eco- 
nomics from the University of California, Santa Barbara 
in 1993, and he completed bachelor’s and master’s de- 
grees at Brigham Young University. 






Introduction 

There has been considerable debate over the results of 
the voucher program that has been in place in Milwau- 
kee since 1990. Researchers who have analyzed the ef- 
fect of participating in the Milwaukee Parental Choice 
Program (MPCP) on student mathematics and reading 
test scores have reached different conclusions. One pos- 
sible explanation for the divergent findings is that there 
are important differences in student characteristics be- 
tween the public and private sectors; namely, that private 
schools are creaming off students from the public schools 
who are most likely to have high academic achievement, 
and that researchers examining this issue have used dif- 
ferent methodologies to account for this. Some of the 
differences in student characteristics, such as family in- 
come and parental education, are readily observable and 
can easily be incorporated into standard statistical mod- 
els. However, in the case of private schooling, statisti- 
cians worry about “sample selection” — the degree to 
which “unobservable” differences between students in the 
two sectors generate biased estimates of the effects of pri- 
vate schooling. 
Sample selection occurs when characteristics of students
that influence academic achievement and are unobserv- 
able in the data are systematically related to the school 
sector in which they are enrolled. 1 We might expect 
sample selection to be present in the case of private school- 
ing because parents freely choose the school sector in 
which their children are enrolled. This fact suggests that 
parents who choose the private sector may be quite dif- 
ferent from other parents, but in ways that are difficult 
to identify. For instance, in most cases parents have to 
pay additional monies, both in tuition and in transpor- 
tation costs, to send their children to private schools. 
These parents demonstrate a willingness to support edu- 
cation which could indicate that they also provide an 
environment in the home which is conducive to educa- 
tional achievement; factors which are difficult to account 
for statistically. 

If there are important “unobservable” differences, then 
standard statistical models (ordinary least squares) of 
achievement are inadequate. The problem faced is that 



1 See Goldhaber (1996) and Figlio and Stone (1997) for a detailed discussion of sample selection associated with private schooling.
it cannot be determined whether differences in achievement
between public and private school students are due
to genuine differences in performance attributable to 
school type or to underlying differences in motivation, 
home environment, etc. Thus, central to the debate over 
the effects of school choice is the issue of sample selec- 
tion. 
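The selection problem can be illustrated with a small simulation. The sketch below is not a model of the Milwaukee data: every parameter (the score scale, the strength of the unobserved "motivation" trait, and the threshold for choosing private school) is a hypothetical value chosen for illustration. The true effect of private schooling is deliberately set to zero, so any gap that a simple comparison of means reports is produced entirely by selection on the unobserved trait.

```python
import random
import statistics


def simulate_selection(n_students: int = 20_000, seed: int = 0) -> float:
    """Return the naive private-minus-public gap in mean test scores.

    The true effect of private schooling is zero by construction, so a
    nonzero gap reflects selection on an unobserved trait alone.
    """
    rng = random.Random(seed)
    private_scores, public_scores = [], []
    for _ in range(n_students):
        # Trait the analyst cannot observe (hypothetical scale).
        motivation = rng.gauss(0.0, 1.0)
        # More motivated families are more likely to choose private school.
        chooses_private = motivation + rng.gauss(0.0, 1.0) > 1.0
        # Achievement depends on motivation but NOT on school sector.
        score = 50.0 + 10.0 * motivation + rng.gauss(0.0, 5.0)
        if chooses_private:
            private_scores.append(score)
        else:
            public_scores.append(score)
    return statistics.mean(private_scores) - statistics.mean(public_scores)


naive_gap = simulate_selection()
print(f"Naive private-public gap with zero true effect: {naive_gap:.1f} points")
```

Because motivation is unobserved, adding observable controls such as family income to a regression would not remove this gap unless those controls happened to capture the trait fully.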

Data from the Milwaukee school choice experiment al- 
low researchers to explore the issue of selection because 
there are public school students who had the opportu- 
nity to apply to participate in a voucher program but 
chose not to, and there are students in the public sector 
who applied for the voucher program but were rejected 
from it. Here, we examine the Milwaukee data to inves- 
tigate the extent to which those students who enrolled in 
the Milwaukee choice program differed from their pub- 
lic school counterparts in terms of both “observed” and 
“unobserved” characteristics. We begin with a discus- 
sion of the school choice program in 
Milwaukee. 

School Choice In 
Milwaukee 

The Milwaukee Parental Choice Pro- 
gram began in the fall of 1990. The 
parameters of the MPCP are as follows. 

Students enrolled in the Milwaukee 
public school (MPS) district who came 
from families with incomes not exceed- 
ing 1.75 times the national poverty line, 
and who were not enrolled in a private 
school in the immediate prior year, were 
eligible to attend private, nonsectarian 
schools in the district (Witte 1997). 
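The eligibility rule just described can be written as a simple check. This is a sketch under stated assumptions: the class and field names are our own, the poverty threshold is treated as an input supplied for the applicant's family size, and the incomes in the usage lines are hypothetical. Only the 1.75 multiple, the public school enrollment requirement, and the prior-year private school exclusion come from the program description above.

```python
from dataclasses import dataclass

# From the program description: family income may not exceed 1.75 times
# the national poverty line.
INCOME_CAP_MULTIPLE = 1.75


@dataclass
class Applicant:
    family_income: float
    poverty_line: float               # assumed input for the family's size
    enrolled_in_mps: bool             # enrolled in Milwaukee public schools
    in_private_school_last_year: bool


def is_eligible(applicant: Applicant) -> bool:
    """Apply the three eligibility conditions stated in the text."""
    income_ok = (
        applicant.family_income
        <= INCOME_CAP_MULTIPLE * applicant.poverty_line
    )
    return (
        applicant.enrolled_in_mps
        and income_ok
        and not applicant.in_private_school_last_year
    )


# Hypothetical applicants for illustration.
print(is_eligible(Applicant(20_000, 15_000, True, False)))
print(is_eligible(Applicant(30_000, 15_000, True, False)))
```

The first applicant falls under the income cap and meets both enrollment conditions; the second exceeds the cap.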

The total number of choice students in any year was lim- 
ited to 1 percent of the MPS membership in the first 
four years, and was increased to 1.5 percent for the 1994-95
school year. 2 For each choice student who enrolled,
their respective private school received a payment equiva- 



lent to the MPS per student state aid (about $3,200 in 
1994-95). Private schools were required to limit choice
students to 49 percent of their enrollment (this figure
rose to 65 percent beginning in the 1994-95 school year)
and schools that were oversubscribed were required to 
accept students based on a random selection. 3 This last 
provision provides a “natural experiment” because sev- 
eral of the schools were oversubscribed, resulting in the 
random assignment of students between the public and 
private sectors. In other words, in several cases there was 
a greater demand for participation in private schools in 
the MPCP than there were slots available (given the speci- 
fications of the program) in those schools. 4 The ran- 
domness of program participation allows researchers to 
compare students who applied for admission through the 
MPCP, were not selected through the lottery process, and 
therefore attended a public school, to those who applied 
through the MPCP, were selected through the lottery 
process, and attended private school. 5 In theory, this 
approach avoids the selection problem 
discussed in the previous section. 
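A simulation can show why the lottery comparison is attractive. The sketch below is illustrative only: the score scale, the unobserved "motivation" trait, the coin-flip admission rule, and the assumed true effect of 2 points are all hypothetical, and the lottery is assumed to be genuinely random. Families who apply are more motivated than those who do not, but among applicants the lottery decides who attends private school.

```python
import random
import statistics


def simulate_lottery(n_students: int = 40_000,
                     true_effect: float = 2.0,
                     seed: int = 1) -> tuple[float, float]:
    """Return (lottery_estimate, naive_estimate) of the private school effect."""
    rng = random.Random(seed)
    winners, losers, nonapplicants = [], [], []
    for _ in range(n_students):
        motivation = rng.gauss(0.0, 1.0)            # unobserved trait
        applies = motivation + rng.gauss(0.0, 1.0) > 1.0
        base = 50.0 + 10.0 * motivation + rng.gauss(0.0, 5.0)
        if not applies:
            nonapplicants.append(base)              # stays in public school
        elif rng.random() < 0.5:                    # oversubscribed: coin flip
            winners.append(base + true_effect)      # attends private school
        else:
            losers.append(base)                     # applied, stays public
    # Winners versus losers: both groups chose to apply, so they share the
    # same distribution of the unobserved trait.
    lottery_estimate = statistics.mean(winners) - statistics.mean(losers)
    # Winners versus all public school students: mixes in nonapplicants.
    naive_estimate = (
        statistics.mean(winners) - statistics.mean(losers + nonapplicants)
    )
    return lottery_estimate, naive_estimate


lottery_estimate, naive_estimate = simulate_lottery()
print(f"Lottery-based estimate: {lottery_estimate:.1f} points")
print(f"Naive estimate:         {naive_estimate:.1f} points")
```

Under these assumptions the winner-versus-loser comparison lands near the assumed true effect, while the comparison against all public school students is inflated by selection. If admission were not in fact random, the first comparison would lose this property.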

A great deal of effort has been spent 
collecting student-level data for the 
evaluation of the Milwaukee experi- 
ment. Family background information 
was solicited from all participating stu- 
dents and for a large random sample 
of nonparticipating public school stu- 
dents, and tests were administered in 
the spring to both program participants 
and nonparticipants in grades K-8. 6 
The test administered is the Iowa
Tests of Basic Skills (ITBS), a
nationally normed standardized test
with scores ranging from 1 to 99 and
a national mean of 50. Several researchers have examined
these data to determine the relative effectiveness of
public and private schools in Milwaukee. 






2 The program began with an enrollment of 341 students in 7 schools. By 1995, enrollment had risen to 830 students, with 12 schools 
participating in the program (Witte 1997). 

3 Schools were required to admit choice students without discrimination based on race, ethnicity, or prior academic performance, but were 
not required to admit disabled students (Witte 1997). 

4 The school choice program was amended in June 1995. For details of the changes to the program see Witte (1997). 

5 However, as Witte (1997) notes, there was no authority ensuring that the selection was, in fact, random. 

6 In total, approximately 4,000 student-year observations were collected. However, students who were observed in multiple years contributed 
more than a single observation. 